1. INTRODUCTION

Nowadays, computers are used to simulate many types of human functions in digital form. For example,
sound synthesizers are commonly run on a digital computer as a program that produces digital sound
samples (a waveform) [1]. The algorithm is designed to emulate the target sound by tuning certain
internal parameters [2]. These parameters are variables set by the user according to the desired sound.
The algorithms that generate sound are called sound synthesis techniques (SSTs). Classic SSTs follow a
traditional mathematical technique for estimating the internal parameters. Estimating the parameters
needed to design a functional form of an SST is a problem-dependent task that relies on human skill.
However, generating sounds is a remarkably difficult task: the target sound is slightly inharmonic, and
its partials possess a certain stochastic, low-amplitude, high-frequency deviation [3].

Traditional sound synthesis needs comprehensive human experience [4], along with refinement processes,
to estimate the internal parameters. The motivation behind this research is to find an alternative
method to synthesize the target sound. Usually, techniques rely on tuning a large number of parameters
mathematically, up to 200 or more in some cases. However, synthesizers can respond non-linearly to some
parameters. This means that a change to one parameter may affect another, and a small change to one
parameter can cause a large change in the sound [5]. Currently, these parameters are estimated
automatically by treating the SST as an optimization problem solved using artificial evolution (AE) [1].
The first attempt to generate complex music sounds using the evolutionary paradigm was described by
Dawkins in 1986 [6].

In 1993, Horner et al. [7] used genetic algorithms (GAs), a type of evolutionary method, to estimate
the internal parameters of FM synthesizers. The authors of [8] proposed automatically designed digital
synthesizer circuits. The circuits generate sounds comparable to a sampled (target) sound, relying on a
GA. This algorithm initializes a population of circuit elements (individuals) and successively improves
these elements to obtain a better circuit for generating the target sound.

The next section reviews previous work on sound synthesis, focusing on the problems that are close to
our approach. Section 3 gives a general review of the "Parisian evolution" strategy, in particular the
fly algorithm, and presents its adaptation to evolutionary sound synthesis; the fly algorithm has
previously been used for medical tomography reconstruction, robotics and digital art generation.
Section 4 presents the results. Finally, the conclusion is given in the last section.


2. EVOLUTIONARY SOUND SYNTHESIS 

Sound synthesis has been an active research area for more than four decades. Several techniques exist
for sound synthesizers, such as additive synthesis, subtractive synthesis, frequency modulation,
wavetable synthesis and physical modeling [9]. Sound synthesis from images has several implementations;
one possible application lets an artist draw sound [10]. Bragand et al. [10] present a user interface
that generates sound from an image relying on a Voronoi algorithm. In another approach, Photosounder
[11], the input image is treated as a magnitude spectrogram and an inverse fast Fourier transform (FFT)
is applied to it to generate sound. Sound synthesis has also been addressed using evolutionary
algorithms (EAs). In 2001, Garcia and his colleague applied genetic programming (GP) and perceptual
distance metrics for measuring the distance between the target and produced sounds [12].

The researchers in [13] enhance the FM synthesis model with some changes. They rely on more waveforms
than a sine wave, such as a sawtooth wave. For synthesis, that paper uses a GA with a fitness function
that is a weighted sum of two spectrum-comparison metrics. The crossover selection parameter was shown
to have a significant effect on the problem domain. Yong [14] follows Lai and his colleagues in using a
GA for sound synthesis. However, he depends on the DFM synthesis model with the same type of fitness
function. The implementation is done in MATLAB and produces an output spectrum from an input (target)
sound spectrum file. Other researchers [15] have used cellular automata (CA). This technique can be
classified as one of a class of evolutionary algorithms for modeling dynamic systems whose
characteristics change with time.

Our work follows the evolutionary sound paradigm: sound synthesis is solved by an EA as an optimization
problem. Specifically, our proposed method relies on the fly algorithm, which is a type of cooperative
co-evolution (CoCo) strategy [16]. In this research, we treat sound synthesis as image reconstruction.
Our method treats sound synthesis as a specific case of the set cover problem, in which a group of
tiles is placed as a set on a square region to approximate a coloured image [17]. Here, sound synthesis
falls under the subject of generating a digital mosaic by fitting mosaic tiles on a surface. Each tile
requires 9 elements to be set (3 colour components, height, width, rotation angle, and 3-D position),
so for N tiles the search space is complex, with 9N dimensions. Such a problem is a difficult
optimization problem to solve. In this research, we propose to solve it using a type of cooperative
co-evolution algorithm (CCEA) called "Parisian evolution" [18]. The Parisian approach differs from
classical EAs. To solve an optimization problem, a classical EA searches for the single best individual
as the solution, whereas the CCEA strategy takes a set or subset of the individuals of the population
as the final solution. This means that every individual is a part of the solution, and all individuals
collaborate to build the final solution [19].


3. OVERVIEW OF THE FLY ALGORITHM 

For solving the sound reconstruction problem, we follow the main mechanics of Parisian evolution.
Figure 1 shows the principles of the steady-state Parisian evolution algorithm [20]. This algorithm is
similar to a classical EA: it contains the usual genetic operators of an EA (selection, mutation, and
recombination), plus an additional ingredient, namely two types of fitness:
- Global fitness evaluates the whole population [21, 22].
- Local fitness evaluates the contribution of a single individual to the global solution.

Figure 1. Steady-state cooperative co-evolution strategy.


In Section 2, we mentioned that our work depends on the Parisian (fly) strategy. The individuals in
this algorithm correspond to two types of structures. One corresponds to exceedingly simple primitives:
the flies [23], which represent a 3-D position only. The other structure contains 9 elements, as shown
in Figure 2:

- The position is a vector of 3-D coordinates (x, y, z), which are randomly generated between 0 and
width - 1, 0 and height - 1, and -1 and 1 respectively (with width and height the number of pixels in
the image along the x- and y-axis). An example of an image generated by an initial population is shown
in Figure 3.

- The colour is represented by three components (r, g, b), which are red, green and blue respectively.
The colour elements are randomly generated between 0 and 1 so that each individual can display a wide
range of colours.

- The rotation angle is randomly generated to cover a complete turn, between 0 and 360 degrees.

- The scaling factor controls the size of the tile along the horizontal and vertical axes of the
two-dimensional coordinate system.

- The local fitness measures the fly's marginal contribution to the global solution.

Figure 2. The fly data structure
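The structure listed above and its random initialisation can be sketched as follows. This is a minimal
illustration, not the authors' implementation: the names `Fly`, `random_fly` and `random_population`
are hypothetical, and the range of the scaling factors (`max_scale`) is an assumption, since the text
does not state it.

```python
import random
from dataclasses import dataclass


@dataclass
class Fly:
    # 3-D position: x in [0, width - 1], y in [0, height - 1], z in [-1, 1]
    x: float
    y: float
    z: float
    # Colour components, each in [0, 1]
    r: float
    g: float
    b: float
    angle: float    # rotation angle in degrees, in [0, 360]
    scale_x: float  # horizontal scaling factor of the tile
    scale_y: float  # vertical scaling factor of the tile
    local_fitness: float = 0.0  # marginal fitness, filled in during evolution


def random_fly(width, height, rng, max_scale=1.0):
    """Draw the 9 elements of one fly uniformly at random.

    max_scale is an assumed upper bound; the source gives no range.
    """
    return Fly(
        x=rng.uniform(0, width - 1),
        y=rng.uniform(0, height - 1),
        z=rng.uniform(-1, 1),
        r=rng.uniform(0, 1),
        g=rng.uniform(0, 1),
        b=rng.uniform(0, 1),
        angle=rng.uniform(0, 360),
        scale_x=rng.uniform(0, max_scale),
        scale_y=rng.uniform(0, max_scale),
    )


def random_population(n_flies, width, height, seed=None):
    """Build an initial population of n_flies random flies."""
    rng = random.Random(seed)
    return [random_fly(width, height, rng) for _ in range(n_flies)]


population = random_population(10, width=400, height=200, seed=1)
```

Keeping the local fitness inside the structure mirrors Figure 2, where it is stored alongside the 9
optimised elements.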





Figure 3. Random initial population 
The goal of our algorithm is to optimize the 9 elements of all individuals. The proposed method aims to
minimize the global fitness function. We rely on the sum of absolute errors (SAE), also called the
Manhattan distance [24]. This metric assesses how close the population is to the reference sound.


SAE(RCI, INI) = \sum_{i} \sum_{j} | INI(i, j) - RCI(i, j) |    (1)


The SAE compares the reference image INI with the reconstructed image RCI.
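As a small worked example of the SAE, the sketch below assumes greyscale images stored as nested Python
lists; the function name `sae` and the sample values are illustrative only.

```python
def sae(rci, ini):
    """Sum of absolute errors between two images of identical size."""
    if len(rci) != len(ini) or any(len(a) != len(b) for a, b in zip(rci, ini)):
        raise ValueError("images must have the same dimensions")
    return sum(
        abs(p - q)
        for row_rci, row_ini in zip(rci, ini)
        for p, q in zip(row_rci, row_ini)
    )


ini = [[0.0, 1.0], [1.0, 0.0]]  # reference image
rci = [[0.0, 0.5], [1.0, 1.0]]  # reconstructed image
print(sae(rci, ini))  # 0 + 0.5 + 0 + 1 = 1.5
```

A value of 0 means the reconstruction matches the reference exactly; larger values mean a worse match.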

While assessing the performance of a single fly, we use local fitness. This fitness is known as the
"marginal fitness", Fm(i) (see (2)). Our algorithm seeks to improve the performance of the population
by increasing the number of good flies relative to bad flies. The SAE metric and the leave-one-out
cross-validation method are used to gauge the compatibility of Fly i with the reference image. We keep
a fly that makes a good contribution to the population and leave out a bad one.


Fm(i) = SAE(RCI - {i}, INI) - SAE(RCI, INI)    (2)


where RCI - {i} is the image calculated with all individuals except Fly i. The sign of Fm(i) is
interpreted as follows:

- Fm(i) < 0 when the error is greater with Fly i than without it. This means that Fly i is incompatible
with the solution formed by the rest of the individuals.

- Fm(i) > 0 when the error is smaller with Fly i than without it. This means that Fly i contributes to
the convergence towards the optimal solution.

- Fm(i) = 0 when the error is the same with and without Fly i. This means that Fly i is neither
valuable nor destructive.
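The three cases above can be checked on a toy example. The sketch below makes strong simplifying
assumptions that are not taken from the source: a fly is reduced to an axis-aligned grey rectangle
`(x0, y0, w, h, value)`, and rendering is purely additive. Only the leave-one-out structure of (2) is
meant to be illustrated.

```python
def render(flies, width, height):
    """Additively draw simplified flies (x0, y0, w, h, value) into an image."""
    img = [[0.0] * width for _ in range(height)]
    for x0, y0, w, h, value in flies:
        for y in range(max(0, y0), min(height, y0 + h)):
            for x in range(max(0, x0), min(width, x0 + w)):
                img[y][x] += value
    return img


def sae(a, b):
    """Sum of absolute errors between two images of identical size."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def marginal_fitness(flies, i, ini):
    """Fm(i) = SAE(RCI - {i}, INI) - SAE(RCI, INI), by leave-one-out."""
    height, width = len(ini), len(ini[0])
    without_i = flies[:i] + flies[i + 1:]
    return (sae(render(without_i, width, height), ini)
            - sae(render(flies, width, height), ini))


# Reference: 4 x 4 image with a bright 2 x 2 block in the top-left corner.
ini = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
flies = [
    (0, 0, 2, 2, 1.0),  # good fly: covers the bright block
    (2, 2, 2, 2, 1.0),  # bad fly: paints over a dark area
    (5, 5, 1, 1, 1.0),  # neutral fly: lies outside the image
]
for i in range(len(flies)):
    print(i, marginal_fitness(flies, i, ini))  # 4.0, -4.0, 0.0
```

Re-rendering the whole population for every fly is wasteful; it is done here only to keep the
definition in (2) visible.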

As the algorithm proceeds, the number of bad flies decreases. For the selection stage, we use the
threshold selection operator [25]: if Fm(i) < 0, then Fly i can be left out; otherwise it is a good
candidate for reproduction. As the stopping criterion, the algorithm stops when it struggles to find
bad flies to kill.
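One way to read the selection and stopping rules above is the steady-state loop sketched below. It is
an interpretation under stated assumptions, not the authors' code: flies are simplified to axis-aligned
rectangles `(x0, y0, w, h, value)` rendered additively, the mutation is a hypothetical one-pixel shift,
and "struggling to find bad flies" is modelled by a hypothetical `patience` counter of consecutive
unsuccessful picks.

```python
import random


def render(flies, width, height):
    """Additively draw simplified flies (x0, y0, w, h, value) into an image."""
    img = [[0.0] * width for _ in range(height)]
    for x0, y0, w, h, value in flies:
        for y in range(max(0, y0), min(height, y0 + h)):
            for x in range(max(0, x0), min(width, x0 + w)):
                img[y][x] += value
    return img


def sae(a, b):
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def marginal_fitness(flies, i, ini):
    height, width = len(ini), len(ini[0])
    without_i = flies[:i] + flies[i + 1:]
    return (sae(render(without_i, width, height), ini)
            - sae(render(flies, width, height), ini))


def mutate(fly, width, height, rng):
    """Hypothetical mutation: shift the tile by at most one pixel per axis."""
    x0, y0, w, h, value = fly
    x0 = min(width - 1, max(0, x0 + rng.randint(-1, 1)))
    y0 = min(height - 1, max(0, y0 + rng.randint(-1, 1)))
    return (x0, y0, w, h, value)


def evolve(flies, ini, rng, max_iterations=1000, patience=50):
    """Replace bad flies (Fm < 0) with mutated copies of good flies (Fm >= 0)."""
    height, width = len(ini), len(ini[0])
    flies = list(flies)
    misses = 0  # consecutive picks that did not hit a bad fly
    for _ in range(max_iterations):
        if misses >= patience:  # assumed reading of the stopping criterion
            break
        i = rng.randrange(len(flies))
        if marginal_fitness(flies, i, ini) >= 0:
            misses += 1  # threshold selection: this fly is kept
            continue
        misses = 0
        good = [j for j in range(len(flies))
                if j != i and marginal_fitness(flies, j, ini) >= 0]
        # Fall back to the bad fly itself if no good parent exists (assumption).
        parent = flies[rng.choice(good)] if good else flies[i]
        flies[i] = mutate(parent, width, height, rng)
    return flies
```

The population size stays constant, because each killed fly is immediately replaced by one new fly.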


4. RESULTS 

In this section, we apply automated sound synthesis through the fly algorithm and use perceptual
distance metrics to measure the distance between the reference (ref) and generated (gen) sound. As
described in the previous section, this work relies on the SAE metric to quantify the error between
the generated and reference sound. For fast processing, we compute the SAE with the help of a graphics
processing unit (GPU) using the OpenGL Shading Language (GLSL), as shown in Figure 4.


[Diagram labels: Framebuffer, Framebuffer, Fragment shader, Framebuffer, Boost.Compute]

Figure 4. Computation of the marginal fitness on GPU for two images using GLSL and OpenGL (in yellow 
and green boxes respectively) 


The gen sound is rendered off-screen using a Frame Buffer Object (FBO). Both sounds, ref and gen, are
stored as 2-D OpenGL textures. In the next step, the textures are passed to a GLSL shader program,
which computes the pixel-wise absolute error between ref and gen. The summation is then completed on
the GPU using the OpenCL-based reduction operator provided by Boost.Compute [26], which supplies the
SAE efficiently. In addition, our method uses only a mutation operator to generate a good fly that
replaces a bad fly during an iteration of the optimization. The crossover operator is excluded to
ensure that each new fly is a partially modified copy of a good fly from the previous generation. In
other words, if crossover were used, a new fly produced in between two good flies would very likely be
bad.
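The two-stage SAE computation described above (a per-pixel absolute error followed by a reduction) can
be mimicked on the CPU as follows. This is only an analogue written for illustration: it uses neither
OpenGL nor Boost.Compute, and the function names are hypothetical.

```python
def abs_error_map(ref, gen):
    """Stage 1 (fragment-shader analogue): per-pixel |ref - gen|."""
    return [
        [abs(r - g) for r, g in zip(row_ref, row_gen)]
        for row_ref, row_gen in zip(ref, gen)
    ]


def tree_reduce_sum(values):
    """Stage 2 (reduction analogue): pairwise sums, halving the data each pass."""
    values = list(values)
    if not values:
        return 0.0
    while len(values) > 1:
        if len(values) % 2:
            values.append(0.0)  # pad so that every element has a partner
        values = [values[k] + values[k + 1] for k in range(0, len(values), 2)]
    return values[0]


def sae_two_stage(ref, gen):
    error = abs_error_map(ref, gen)
    return tree_reduce_sum(p for row in error for p in row)


ref = [[0.0, 1.0], [1.0, 0.0]]
gen = [[0.0, 0.5], [1.0, 1.0]]
print(sae_two_stage(ref, gen))  # 1.5
```

Each pass of the pairwise reduction consists of independent additions, which is what makes this stage
suitable for parallel hardware.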

Our method is tested with the parameters shown in Table 1 and Table 2, using a search space with
9 x (number of flies) dimensions. In the first steps of the algorithm, the flies are randomly generated
using square tiles, as shown in Figure 3. The mutation probability is fixed to 100% because crossover
is not suitable for our algorithm. The algorithm itself runs automatically; however, the image (and its
size) and the number of individuals are selected by the user. To avoid slowing down the whole process
and to avoid premature convergence, the user balances the number of flies against the image size. For
realistic sound synthesis, our algorithm replaces the square tiles with a stripe mask, as shown in
Figures 5 and 6. We use the TwistedWave sound editor for reading the sound. Mono sound is used in this
research, with the waveform sampled at 8000 Hz. Figure 7 shows an example of a reference sound used in
this article.
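To make the size of the search space concrete, the settings of Tables 1 and 2 can be written down as a
small configuration object. The class and field names below are hypothetical; only the numeric values
come from the tables.

```python
from dataclasses import dataclass

GENES_PER_FLY = 9  # 3-D position, 3 colour components, rotation angle, 2 scaling factors


@dataclass(frozen=True)
class RunConfig:
    image_size: tuple    # as listed in the tables: 200 x 400
    n_flies: int
    n_generations: int
    p_mutation: float    # 1.0 means 100%
    p_crossover: float   # 0.0 means crossover is disabled

    @property
    def search_space_dim(self):
        """Number of optimised values: 9 per fly."""
        return GENES_PER_FLY * self.n_flies


TABLE_1 = RunConfig((200, 400), 3500, 200_000, 1.0, 0.0)
TABLE_2 = RunConfig((200, 400), 7500, 200_000, 1.0, 0.0)

print(TABLE_1.search_space_dim)  # 31500
print(TABLE_2.search_space_dim)  # 67500
```

The second test therefore optimises more than twice as many values as the first one on an image of the
same size.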


Table 1. Parameters for the first test of the algorithm

Parameter                       Value
Image size                      200 x 400
No. of flies                    3500
No. of generations              200000
Probability of mutation (pm)    100%
Probability of crossover        0.0%

Table 2. Parameters for the second test of the algorithm

Parameter                       Value
Image size                      200 x 400
No. of flies                    7500
No. of generations              200000
Probability of mutation (pm)    100%
Probability of crossover        0.0%

Figure 5. Mask using a stripe line

Figure 6. Fitting a stripe line on square tiles


The result is built gradually over the iterations of the fly algorithm which, as mentioned, starts with
random stripe flies. The flies are then rearranged to approach the target image. The results in
Figures 8 and 9 rely on Table 1 and Table 2 respectively. Figures 8 and 9 show how the reconstructed
flies gather around the reference image; as the algorithm executes, the shape gets sharper. The
algorithm stops either when the error between the reference and the reconstructed image becomes small
enough or when the stopping condition is met. The result in Figure 9 converges to the reference image
more closely than the result in Figure 8. We conclude that the more flies are used, the more realistic
the final result is.





Figure 7. Reference image 







Figure 8. Results of sound reconstruction using the fly algorithm depending on the parameters of Table 1

Figure 9. Results of sound reconstruction using the fly algorithm depending on the parameters of Table 2


5. CONCLUSION 


The proposed method tackles a problem in the field of evolutionary sound synthesis. The method
addresses the problem as an image reconstruction problem. The algorithm depends on hybrid techniques
inherited from AE, scientific computing and computer graphics (CG). The AE part is the fly algorithm,
which reconstructs the sound template. For generating the sound data, real-time CG rendering and a
graphics processing unit (GPU) are used to compute the fitness function. The algorithm uses stripe
tiles to create sound visual effects.